(54) System, method and computer program product for resource discovery
in a distributed computing environment



(57) A large distributed enterprise includes computing
resources including a management server servicing a
plurality of endpoint machines. A management infrastructure
including a runtime engine is deployed on given endpoint
machines. In response to a task deployment request at an
administrative server, discovery agents may be launched into
the computer network. When a software agent arrives at a
given machine that supports the runtime engine, the agent is
executed to determine whether the endpoint is a candidate
for a particular task deployment.
Description

[0001] The present invention is directed to managing
a large distributed computer enterprise network and,
more particularly, to performing discovery operations
therein preferably using software components that are
deployed in the network and adapted to be executed in
local runtime environments.

[0002] Today, companies desire to place all of their 
computing resources on the company network. To this 
end, it is known to connect computers in a large, geo- 
graphically-dispersed network environment and to man- 
age such an environment in a distributed manner. One 
such management framework comprises a server that 
manages a number of nodes, each of which has a local 
object database that stores object data specific to the 
local node. Each managed node typically includes a 
management framework, comprising a number of man- 
agement routines, that is capable of a relatively large 
number (e.g., hundreds) of simultaneous network con- 
nections to remote machines. As the number of man- 
aged nodes increases, the system maintenance prob- 
lems also increase, as do the odds of a machine failure 
or other fault. 

[0003] The problem is exacerbated in a typical enter- 
prise as the node number rises. Of these nodes, only a 
small percentage are file servers, name servers, data- 
base servers, or anything but end-of-wire or "endpoint" 
machines. The majority of the network machines are 
simple personal computers ("PC's") or workstations that 
see little management activity during a normal day. 
[0004] System administrators typically manage such 
environments through system and network tasks that 
are configured by the administrator on some local ma- 
chine and then distributed or deployed into the network. 
A machine that is to receive the task is referred to as a 
"deployment target". The locations and characteristics
of the target machines, however, are typically deter- 
mined by the administrator manually. Thus, for example, 
if the task to be deployed is a database management 
application, the administrator must specify the particular 
database servers in the network. This process is cum- 
bersome and time-consuming, especially as the size of 
the network increases to include thousands of connect- 
ed machines. If the system administrator does not spec- 
ify all target machines, the system administration task 
may be implemented incorrectly. Alternatively, if the 
number and location of targets is over-specified, net- 
work resources are consumed unnecessarily. 
[0005] In addition, there are many other reasons why 
network administrators have an interest in performing 
so-called "discovery" operations in such a large man- 
aged environment. As one example, an administrator 
may desire to determine how many and which machines 
in the environment presently support a given version of 
a software program. Discovery may also be required to
determine whether a particular machine has sufficient
resources (e.g., available disk storage) to support a
software upgrade. Yet another reason to perform a discovery
operation might simply involve a need or desire to
perform system or resource inventory to facilitate planning
for future enterprise expansion. The nature and
types of discovery operations are thus quite varied.
[0006] Known distributed management architectures 
do not afford the system administrator the ability to issue 
a distribution request and deploy a task without having 
to manually associate the tasks with given groups of
machines. Likewise, such known techniques have not been
readily adapted to facilitate a wide range of basic
discovery operations that are desired to facilitate system
administration, management and maintenance in such
an environment, especially as the network grows to
include thousands of connected, managed machines.
[0007] The present invention addresses these and 
other associated problems of the prior art. 
[0008] It is thus a primary object of this invention to 
perform discovery operations in a distributed computer 
enterprise environment in which a large number of
machines are connected and managed.
[0009] It is another primary object of this invention to 
deploy software discovery agents in the distributed
computer network that are executed in local runtime
environments to perform such discovery operations.

[0010] Another primary objective of this invention is to
provide software components that are readily deployed
into a distributed, managed environment for discovering
given facts (e.g., machine and/or source identity,
characteristics, state, status, attributes, and the like) that are
then useful in controlling a subsequent operation (e.g.,
a task deployment).

[0011] A more specific object of this invention is to
provide a mechanism by which a dispatcher may identify
particular machines that are candidates to receive a task
deployment so that an administrator or other user need
not manually associate the task with given groups of
machines.

[0012] It is a particular object of this invention to
deploy a Java-based software "discovery agent" into a
distributed computer network environment to discover
particular machines or resources that are to be targeted to
receive a particular task deployment within the network.
[0013] A further object of this invention is to launch a set
of one or more discovery agents into a large, distributed
computer network in response to a given request for the
purpose of identifying and locating suitable target
machines or resources for receipt of a given task. The task
may be an administrative task, a management task, a
configuration task, or any other application.

[0014] A further specific object of this invention is to
customize or tailor the software agent dispatched in the
network for discovery purposes as a function of the type
of task to be subsequently deployed. Thus, the software
agent may more readily determine whether a candidate
machine may qualify as a potential target for the
deployment.

[0015] Yet another more general object of this invention
is to more fully automate the discovery of distribution
targets in a large, distributed computing network
and thereby reduce the expense and complexity of
system administration.

[0016] Another object of the present invention is to
initially dispatch a minimum amount of code that may be
necessary to discover distribution targets for a
subsequent task deployment in a large computer network.
[0017] It is a further object of this invention to deploy
a self-routing software agent into a distributed computer
network to discover workstations that satisfy given
criteria. During a particular search, a given agent may
"clone" itself at a particular node to continue the search
along a new network path.

[0018] Yet another more general object of the present
invention is to collect information about workstations in 
a large computer networked environment as mobile dis- 
covery agents are dispatched and migrated throughout 
the network. 

[0019] These and other objects of the invention are
achieved by the disclosed system, method and computer
product for discovery in a large, distributed computer
networking environment. A management infrastructure
supported in the networking environment includes a
dispatch mechanism, which is preferably located at a
central location (e.g., an administrative server), and a
runtime environment supported on given nodes of the
network. In particular, the runtime environment (e.g., an
engine) is preferably part of a distributed framework
supported on each managed node of the distributed
enterprise environment.

[0020] One preferred method begins upon a distribu- 
tion request. The distribution request is not limited to any 
particular type of system or network administration, con- 
figuration or management task. In response to the re- 
quest, the dispatch mechanism determines whether the 
machines targeted for the deployment (namely, the
"target machines") can be identified from local sources
(e.g., a local repository of previously-collected or
generated configuration information). If such information is not
available or is otherwise not useful, the dispatch
mechanism deploys into the network a set of one or more
"discovery agents" that are tasked to locate and identify
suitable target(s) for the deployment. These one or more
agents then "fan out" into the network to collect
information to facilitate subsequent task deployment.
Preferably, the discovery agent is a small piece of code 
that is customized or tailored as a function of the partic- 
ular task to be later deployed. This customization reduc- 
es the time necessary to complete an overall search be- 
cause the agent thus may be "tuned" to evaluate the 
candidate node for a particular characteristic. If that 
characteristic is not present, the software agent may 
then proceed elsewhere (or clone itself to follow a new 
network path). 
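
By way of illustration only, the tailoring of a discovery agent to the task to be deployed might be sketched in Java as follows. The class name, the task names, the node attribute keys and the 500 MB threshold are all hypothetical and are not taken from the disclosed framework.

```java
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch: task names and node attributes are illustrative assumptions.
final class TailoredAgentSketch {
    /** Builds a small node test that depends on the task to be deployed later. */
    static Predicate<Map<String, String>> agentFor(String task) {
        switch (task) {
            case "database-management":
                // Only nodes that report a database server role qualify.
                return node -> "database-server".equals(node.get("role"));
            case "software-upgrade":
                // Only nodes with at least 500 MB of free disk qualify (arbitrary threshold).
                return node -> Long.parseLong(node.getOrDefault("freeDiskMb", "0")) >= 500;
            default:
                // Unknown task: no node is treated as a candidate.
                return node -> false;
        }
    }
}
```

Because each agent carries only the one test relevant to its task, a node lacking the characteristic can be rejected quickly and the agent can move on.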

[0021] When a particular discovery agent arrives at a
node in the network, the software agent preferably is
linked into the local runtime environment already
present to thereby initiate a local discovery process. The
discovery routine executed by the discovery agent may
discover that the local machine (or some resource or
application thereon) is a suitable target, that the local
machine (or some application thereon) is not a suitable
target, or that insufficient information is available to
make this determination. Based on information obtained
during the discovery process, the software agent also
may identify one or more new network paths that must
be traversed to continue the discovery process and
thereby complete the search. The software agent may
then launch itself to another node, or it may "clone" itself
and launch a "cloned" agent over the new network path
as needed.
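
The migration and cloning behavior described above may be simulated, under simplifying assumptions, by the following Java sketch. The network is modeled as a map from each node to the nodes reachable from it, and each queued entry stands for the agent or one of its clones. All names are hypothetical, and nodes judged INDETERMINATE are simply not reported in this sketch.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch: the graph model and all names are assumptions for illustration.
final class MigrationSketch {
    enum Verdict { SUITABLE, NOT_SUITABLE, INDETERMINATE }

    /**
     * Simulates one agent migrating from a start node. Each entry in the work
     * queue stands for the agent itself or for a clone sent down a newly found path.
     * Returns the nodes judged SUITABLE, in the order they were visited.
     */
    static List<String> discover(String start,
                                 Map<String, List<String>> links,
                                 Function<String, Verdict> localDiscovery) {
        List<String> suitable = new ArrayList<>();
        Set<String> visited = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.add(start);
        visited.add(start);
        while (!pending.isEmpty()) {
            String node = pending.remove();
            if (localDiscovery.apply(node) == Verdict.SUITABLE) {
                suitable.add(node);
            }
            // Paths learned at this node: one clone per node not yet visited.
            for (String next : links.getOrDefault(node, Collections.<String>emptyList())) {
                if (visited.add(next)) {
                    pending.add(next);
                }
            }
        }
        return suitable;
    }
}
```

The visited set keeps a clone from being sent to a node that another agent has already examined, so the search ends once every reachable path is exhausted.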

[0022] If the software agent discovers that the candi- 
date machine is a suitable target, certain identifying in- 
formation (e.g., a confirmation, a machine identifier, a 
state identifier or the like) is generated. The identifying 
information is then saved within a datastore associated 
with the agent (if the agent is to return to the dispatch 
mechanism) or, alternatively, such information is trans- 
mitted back to the dispatch mechanism (if the agent is 
to extinguish itself upon completion of the discovery 
process). Such transmission may be effected using a 
simple messaging technique. When a given network 
path is exhausted, the discovery agent then either re- 
turns to the dispatch mechanism or extinguishes itself, 
as the case may be. 
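
The two reporting alternatives described in this paragraph may be sketched in Java as follows. This is an illustrative sketch only: the class name, the `Consumer` used to stand in for the simple messaging technique, and the use of a machine identifier string as the identifying information are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch: class, field and method names are assumptions.
final class ReportingAgentSketch {
    private final boolean returnsToDispatcher;
    private final Consumer<String> messageToDispatcher; // stands in for a simple messaging technique
    private final List<String> datastore = new ArrayList<>();

    ReportingAgentSketch(boolean returnsToDispatcher, Consumer<String> messageToDispatcher) {
        this.returnsToDispatcher = returnsToDispatcher;
        this.messageToDispatcher = messageToDispatcher;
    }

    /** Called when the local discovery finds that a machine is a suitable target. */
    void recordSuitableTarget(String machineId) {
        if (returnsToDispatcher) {
            datastore.add(machineId);               // carried home with the agent
        } else {
            messageToDispatcher.accept(machineId);  // sent now; the agent will extinguish itself
        }
    }

    /** Results the agent carries back; empty for an agent that reports by message. */
    List<String> carriedResults() {
        return new ArrayList<>(datastore);
    }
}
```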

[0023] Thus, at each node, the software agent is pref- 
erably run by the runtime engine previously deployed 
there. Alternatively, the software agent runs as a stan- 
dalone process using existing local resources. When 
the suitability of the workstation (as a target machine) is 
indeterminate, the software agent may obtain additional 
code from the dispatch mechanism or from some other 
network source to facilitate its determination. Such ad- 
ditional code may be another software agent. 
[0024] While one preferred "discovery" operation in- 
volves a determination of whether a given machine or 
resource is a suitable target for a task deployment, other
discovery operations may be implemented in like
manner. Thus, a discovery operation may be imple- 
mented for inventory control, for determining which ma- 
chines support which versions of given software, for de- 
termining the ability of a given machine or an associated 
resource to support given software or to perform a given 
task, and the like. 

[0025] Embodiments of the invention will now be de- 
scribed with reference to the accompanying drawings, 
in which: 

Figure 1 illustrates a simplified diagram showing a 
large distributed computing enterprise environment 
in which the present invention is implemented; 

Figure 2 is a block diagram of a preferred system
management framework illustrating how the framework
functionality is distributed across the gateway
and its endpoints within a managed region;

Figure 2A is a block diagram of the elements that
comprise the LCF client component of the system
management framework;

Figure 3 illustrates a smaller "workgroup" imple- 
mentation of the enterprise in which the server and 
gateway functions are supported on the same ma- 
chine; 

Figure 4 is a distributed computer network environ- 
ment having a management infrastructure for use 
in carrying out the preferred method of the present 
invention; 

Figure 5 is a flowchart illustrating a preferred meth- 
od of deploying a software discovery agent in re- 
sponse to a distribution request in the computer net- 
work; and 

Figure 6 is a flowchart of a software agent local dis- 
covery mechanism according to the preferred em- 
bodiment of this invention. 

[0026] Referring now to Figure 1, the invention is
preferably implemented in a large distributed computer
environment 10 comprising up to thousands of "nodes."
The nodes will typically be geographically dispersed
and the overall environment is "managed" in a distributed
manner. Preferably, the managed environment (ME)
is logically broken down into a series of loosely-connected
managed regions (MR) 12, each with its own management
server 14 for managing local resources within
the MR. The network typically will include other servers
(not shown) for carrying out other distributed network
functions. These include name servers, security servers,
file servers, threads servers, time servers and the
like. Multiple servers 14 coordinate activities across the
enterprise and permit remote site management and
operation. Each server 14 serves a number of gateway
machines 16, each of which in turn supports a plurality of
endpoints 18. The server 14 coordinates all activity
within the MR using a terminal node manager 20.
[0027] Referring now to Figure 2, each gateway ma- 
chine 16 runs a server component 22 of a system man- 
agement framework. The server component 22 is a mul- 
tithreaded runtime process that comprises several 
components: an object request broker or "ORB" 21, an 
authorization service 23, object location service 25 and 
basic object adaptor or "BOA" 27. Server component 22 
also includes an object library 29. Preferably, the ORB 
21 runs continuously, separate from the operating
system, and it communicates with both server and client
processes through separate stubs and skeletons via an
interprocess communication (IPC) facility 19. In
particular, a secure remote procedure call (RPC) is used to
invoke operations on remote objects. Gateway machine
16 also includes an operating system 15 and a threads
mechanism 17.

[0028] The system management framework includes
a client component 24 supported on each of the
endpoint machines 18. The client component 24 is a low
cost, low maintenance application suite that is preferably
"dataless" in the sense that system management data
is not cached or stored there in a persistent manner.
Implementation of the management framework in this
"client-server" manner has significant advantages over
the prior art, and it facilitates the connectivity of personal
computers into the managed environment. Using an
object-oriented approach, the system management framework
facilitates execution of system management tasks
required to manage the resources in the MR. Such tasks
are quite varied and include, without limitation, file and
data distribution, network usage monitoring, user
management, printer or other resource configuration
management, and the like.

[0029] In the large enterprise such as illustrated in 
Figure 1, preferably there is one server per MR with
some number of gateways. For a workgroup-size instal- 
lation (e.g., a local area network) such as illustrated in 
Figure 3, a single server-class machine may be used 
as the server and gateway, and the client machines 
would run a low maintenance framework. References
herein to a distinct server and one or more gateway(s) 
should thus not be taken by way of limitation as these 
elements may be combined into a single platform. For 
intermediate size installations the MR grows breadth- 
wise, with additional gateways then being used to bal- 
ance the load of the endpoints. 
[0030] The server is the top-level authority over all 
gateways and endpoints. The server maintains an
endpoint list, which keeps track of every endpoint in a
managed region. This list preferably contains all information
necessary to uniquely identify and manage endpoints 
including, without limitation, such information as name, 
location, and machine type. The server also maintains 
the mapping between endpoint and gateway, and this 
mapping is preferably dynamic. 
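
As an informal illustration of the endpoint list and the dynamic endpoint-to-gateway mapping maintained by the server, consider the following Java sketch. The three fields follow the examples given in the text (name, location, machine type); the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: the fields follow the text; everything else is an assumption.
final class EndpointRegistrySketch {
    static final class Endpoint {
        final String name;
        final String location;
        final String machineType;

        Endpoint(String name, String location, String machineType) {
            this.name = name;
            this.location = location;
            this.machineType = machineType;
        }
    }

    private final Map<String, Endpoint> endpointList = new LinkedHashMap<>();
    private final Map<String, String> gatewayOf = new HashMap<>();

    /** Adds an endpoint to the list and records the gateway that serves it. */
    void register(Endpoint endpoint, String gateway) {
        endpointList.put(endpoint.name, endpoint);
        gatewayOf.put(endpoint.name, gateway);
    }

    /** The mapping is dynamic: a known endpoint may be moved to another gateway. */
    void reassign(String endpointName, String newGateway) {
        if (!endpointList.containsKey(endpointName)) {
            throw new IllegalArgumentException("unknown endpoint: " + endpointName);
        }
        gatewayOf.put(endpointName, newGateway);
    }

    String gatewayFor(String endpointName) {
        return gatewayOf.get(endpointName);
    }

    int endpointCount() {
        return endpointList.size();
    }
}
```
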
[0031] As noted above, there are one or more gate- 
ways per managed region. Preferably, a gateway is a 
fully-managed node that has been configured to operate 
as a gateway. As endpoints login, the gateway builds an 
endpoint list for its endpoints. The gateway's duties pref- 
erably include: listening for endpoint login requests, lis- 
tening for endpoint update requests, and (its main task) 
acting as a gateway for method invocations on end- 
points. 

[0032] As also discussed above, the endpoint is a ma- 
chine running the system management framework client 
component, which is referred to herein as the low cost 
framework (LCF). The LCF has two main parts as
illustrated in Figure 2A: the LCF daemon 24a and an
application runtime library 24b. The LCF daemon 24a is
responsible for endpoint login and for spawning application
endpoint executables. Once an executable is
spawned, the LCF daemon 24a has no further interaction
with it. Each executable is linked with the application
runtime library 24b, which handles all further
communication with the gateway.

[0033] Preferably, the server and each of the gateways
is a computer or "machine." For example, each
computer may be a RISC System/6000® (a reduced
instruction set or so-called RISC-based workstation)
running the AIX® (Advanced Interactive Executive) operating
system, preferably Version 3.2.5 or greater. Suitable
alternative machines include: an IBM-compatible PC
x86 or higher running Novell UnixWare 2.0, an AT&T
3000 series running AT&T UNIX SVR4 MP-RAS
Release 2.02 or greater, a Data General AViiON series
running DG/UX version 5.4R3.00 or greater, an
HP9000/700 and 800 series running HP/UX 9.00
through HP/UX 9.05, a Motorola 88K series running SVR4
version R40V4.2, a Sun SPARC series running Solaris
2.3 or 2.4, or a Sun SPARC series running SunOS 4.1.2
or 4.1.3. Of course, other machines and/or operating
systems may be used as well for the gateway and server
machines.

[0034] Each endpoint is also a computer. In one pre- 
ferred embodiment of the invention, most of the end- 
points are personal computers (e.g., desktop machines 
or laptops). In this architecture, the endpoints need not 
be high powered or complex machines or workstations. 
One or more of the endpoints may be a notebook com- 
puter, e.g., the IBM ThinkPad® machine, or some other 
Intel x86 or Pentium®-based computer running Win- 
dows '95 or greater operating system. IBM® or IBM- 
compatible machines running under the OS/2® operat- 
ing system may also be implemented as the endpoints. 
An endpoint computer preferably includes a browser, 
such as Netscape Navigator or Microsoft Internet Ex- 
plorer, and may be connected to a gateway via the In- 
ternet, an intranet or some other computer network. 
[0035] A preferred embodiment of the present inven- 
tion is implemented in the enterprise environment as il- 
lustrated in Figure 4. As will be discussed below, a set 
of software "discovery agents" are available at a central
location (e.g., manager 14) or at a plurality of locations 
(e.g., the gateways 16) in the network where adminis- 
trative, configuration or other management tasks are 
specified, configured and/or deployed. The software 
agents are "mobile" in the sense that the agents are dis- 
patched (as will be described below) from a dispatch 
mechanism and then migrate throughout the network 
environment. 

[0036] Generally, the mobile software agents traverse
the network to perform so-called "discovery" operations.
The particular types of discovery operations may be
quite varied. Thus, for example, a particular discovery
operation may be initiated by a user at a managing
resource through a conventional graphical user interface
(GUI) once the discovery application is started. This
operation may simply issue one or more discovery agents
to query each of a set of machines (at which a given agent
is executed) to determine the machine "type". The
discovery operation may identify a list of resources
associated with the given machine. The discovery operation
may identify whether the given machine has a resource
of a particular type. Another discovery operation may
simply query the machine to discover whether the ma- 
chine or some associated resource has a given charac- 
teristic. An example of the latter situation is where a dis- 
covery operation is initiated at the given machine to de- 
termine whether a specific resource (e.g., a disk drive 
partition) meets some defined criteria (e.g., storage 
space). The particular discovery operation thus may be 
quite general or very specific, and the given operation 
may relate to an existing state (e.g., existing resources 
or their operational state) or, alternatively, to determine 
whether the machine can support other resources in the 
future (e.g., for a planned system expansion). An exam- 
ple of the latter situation is when the network adminis- 
trator desires to perform an inventory of existing ma- 
chines to determine which of those machines might re- 
quire a software upgrade. 
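
The disk partition example given above (a specific resource tested against a storage criterion) may be sketched in Java using the standard `java.io.File.getUsableSpace()` method. The class and method names are hypothetical, and the required size is supplied by the caller.

```java
import java.io.File;

// Hypothetical sketch: only File.getUsableSpace() is a real library call here.
final class DiskCheckSketch {
    /**
     * Returns true when the partition holding the given path reports at least
     * requiredBytes of usable space. getUsableSpace() yields 0 when the path
     * does not name a partition, so such paths fail any positive requirement.
     */
    static boolean meetsStorageCriterion(File partition, long requiredBytes) {
        return partition.getUsableSpace() >= requiredBytes;
    }
}
```

A discovery agent tuned for a software upgrade might call `meetsStorageCriterion(new File("/"), upgradeSizeInBytes)` at each candidate node, where `upgradeSizeInBytes` is whatever the upgrade package requires.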

[0037] In a representative application, a network ad- 
ministrator desires to monitor a given resource in the 
distributed computer environment. In such case, of 
course, the nature of the discovery agent may be directly 
linked to the monitoring component on whose behalf it 
is working. Thus, for a monitoring component that wish- 
es to monitor some metric available only from particular 
operating systems, a discovery agent would then report 
successful discovery only on such systems. 
[0038] Some monitoring components may be intend- 
ed to monitor certain resource types wherein several in- 
stances of the resource may be present on any given 
computer. Such resources include, for example, disk 
drives or components thereof (namely, file systems), 
processes of a particular type, log files of a particular 
type, and the like. The discovery agents for these com- 
ponents may then be designed to find instances of such 
resources on a given computer and then, if desired, to
cause the instantiation of a copy of the monitoring com- 
ponent for each resource found. 
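
The per-instance behavior described in this paragraph may be sketched in Java as follows: the agent filters the resources found on the local computer and creates one monitor per matching instance. The names are hypothetical, resources are represented simply as strings, and the monitor type is left generic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical sketch: resources are plain strings and monitors are created by a caller-supplied factory.
final class InstanceDiscoverySketch {
    /** Creates one monitor for every local resource of the monitored type. */
    static <M> List<M> instantiateMonitors(List<String> localResources,
                                           Predicate<String> isMonitoredType,
                                           Function<String, M> monitorFactory) {
        List<M> monitors = new ArrayList<>();
        for (String resource : localResources) {
            if (isMonitoredType.test(resource)) {
                monitors.add(monitorFactory.apply(resource));
            }
        }
        return monitors;
    }
}
```
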
[0039] Some discovery agents are designed to con- 
tinually monitor the state of a system so that resources 
that dynamically appear and disappear may be tracked. 
Such agents typically scan for resource instances ac- 
cording to some simple scheduling metric. An example 
of such a resource is an active connection between a 
client and some software server process. 
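
One simple way to track resources that dynamically appear and disappear is to compare the result of each scan with the previous one. The Java sketch below illustrates that comparison only; the scheduling of scans is omitted, and the class and method names are hypothetical.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: each scan is represented as a set of resource identifiers.
final class ScanDiffSketch {
    /** Resources present in the current scan but absent from the previous one. */
    static Set<String> appeared(Set<String> previousScan, Set<String> currentScan) {
        Set<String> result = new TreeSet<>(currentScan);
        result.removeAll(previousScan);
        return result;
    }

    /** Resources present in the previous scan but absent from the current one. */
    static Set<String> disappeared(Set<String> previousScan, Set<String> currentScan) {
        return appeared(currentScan, previousScan);
    }
}
```
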
[0040] As a result of the discovery process, it is often 
the case that information discovered is collected and de- 
sired to be returned to the dispatcher or some other lo- 
cation. The particular information returned to the user 
will necessarily depend on the type of discovery opera- 
tion initiated. The presentation and formatting of such 
information is a matter of design choice and is not a
limitation of the present invention. To give an example, if
the discovery operation merely seeks the identity of all
machines that have "version x.y" of a given software
routine, then the returned information may be a mere
ordered list of the identified resources displayed in a
scrollable listbox or other known GUI construct.
[0041] In the illustrative embodiment discussed below,
the discovery operation determines which machines
in the managed environment are to receive a particular
task deployment, e.g., a distributed monitoring
application for use in managing resources throughout 
the distributed network. Although this example is de- 
scribed at length, one of ordinary skill will appreciate that 
the nature, type and characteristics of particular discov- 
ery operations may be quite varied depending on the 
circumstances. The present invention is not limited to 
any particular discovery operation or any defined set of 
such operations. 

[0042] In this illustrative embodiment, a particular task 
to be deployed in the environment may be specified but 
the target machines may not be readily ascertainable. 
In such case, an appropriate "discovery" agent is
identified and dispatched to determine this information. If the
agent does not find a target machine at the initial loca- 
tion examined, the agent (or a clone thereof) then mi- 
grates through the network to continue the search. The 
agent preferably chooses its path through the network 
based on the information received at the dispatching lo- 
cation, as well as optionally from information gleaned 
from each examined location. As will be seen, the par- 
ticular "path" typically varies as the software discovery 
agent migrates through the network because informa- 
tion gleaned from a particular node may redirect the dis- 
covery agent in some given manner. 
[0043] For illustrative purposes only, one such discov- 
ery method is implemented in the large, distributed en- 
terprise environment shown in Figure 4, although this 
is not a limitation of the invention. In this example, the 
manager 14 includes the dispatch mechanism 35 hav- 
ing a set of software agents 37 associated therewith. 
Alternatively, dispatch mechanism 35 may include a set 
of configurable software tasks 39 from which one or 
more agents are constructed. Manager 14 preferably
also includes a database 43 including information
identifying a list of all machines in the distributed computing
environment that are designed to be managed. The dis- 
patch mechanism itself may be distributed across mul- 
tiple nodes. 

[0044] At least some of the gateway nodes 16 and at
least some of the terminal nodes 18 (or some defined
subset thereof) include a runtime engine 41 that has
been downloaded to the particular node via a distribution
service. The engine 41 provides a runtime environment
for the software agent. Although not meant to be
limiting, the particular distribution technique may involve
a subscription process such as described in US Patent
No. 5,838,918 (Docket No. AT9-96-077), titled "Method
For Managing Distributed Computer Network Configuration
Information" and assigned to the assignee of the
present invention. Alternately, the diagnostic engines
may be distributed to the various nodes via the technique
described in US Application No. 08/089,964
(Docket No. AT9-97-395), titled "Drag And Drop Technique
For Distributing Visual Builder Built Tasks In A
Computer Network", which is also assigned to the
assignee of this invention.

[0045] As noted above, the present invention
automatically deploys one or more of the software discovery
agents to perform a given discovery operation (e.g., to 
locate a particular machine, a resource thereon, or the 
like) to facilitate a particular administration, configura- 
tion or other management task (or perhaps some other 
service) specified by an administrator or other system 
entity. Preferably, the software agent is a software com- 
ponent (i.e. a piece of code) executed by the runtime 
engine located at the node at which the agent arrives. 
Alternatively, the software agent runs as a standalone 
application using local resources. Yet another
alternative is to have the software agent control an engine which,
in turn, examines the host platform and then performs 
the discovery operation (in this example, determining 
the suitability of the host to receive the target task de- 
ployment). 

[0046] In a representative embodiment, both the runt- 
ime engine and the software agent(s) are conveniently 
written in Java. As is known in the art, Java is an object- 
oriented, multi-threaded, portable, platform-independ- 
ent, secure programming environment used to develop, 
test and maintain software programs. Java programs 
have found extensive use on the World Wide Web, 
which is the Internet's multimedia information retrieval 
system. These programs include full-featured interac- 
tive, standalone applications, as well as smaller pro- 
grams, known as applets, that run in a Java-enabled 
Web browser. 

[0047] In one particular embodiment of the present in- 
vention, a software agent is a Java applet (e.g., com- 
prised of a set of Java "class" files) and the runtime en- 
vironment includes a Java Virtual Machine (JVM) asso- 
ciated with a Web browser. In this illustrative example, 
various nodes of the network are part of the Internet, an 
intranet, or some other computer network or portion 
thereof. 

[0048] When the administrator configures a task for 
deployment, the dispatch mechanism compiles the ap- 
propriate Java class files (preferably based on the task 
or some characteristic thereof) and dispatches the ap- 
plet (as the software agent) in the network. Depending 
on the size, configuration and/or topology of the net- 
work, multiple agents may be dispatched. Each applet 
is then executed on the JVM located at a candidate node 
to determine whether the node is an appropriate target 
for the deployment of the task. 
[0049] Figure 5 is an illustrative discovery routine ac- 
cording to the present invention. Portions of this routine 
may and often do take place at different times and under 
different control circumstances. They are illustrated and
described together merely to simplify the description.
[0050] The routine begins at step 30 with the distribution
of the runtime engines to the various nodes. In this
embodiment, the runtime engine may be part of the LCF
runtime library 24b, as has been previously described.
More likely, the runtime engine is deployed before the
runtime library during a prior node configuration task. In
either case, these runtime engines collectively form a
part of a management infrastructure of the enterprise 
environment. Once the management infrastructure is in 
place, the actual operating routine begins. 
[0051] At step 32, a test is performed at the dispatch 
mechanism 35 to determine whether a given occurrence,
e.g., a task deployment request, a method invocation,
or the like, has been generated or received from
elsewhere in the network. In the case of a task deploy- 
ment as described above, an administrator performs the 
desired task configuration using a conventional GUI. 
The particular task configuration or specification tech- 
nique is not part of the present invention. Step 32, of 
course, may represent any given function and is not lim- 
ited to mere task deployment. For the ease of further 
discussion, however, it is assumed that the discovery 
process is ancillary to one such deployment. 
[0052] If the outcome of the test at step 32 is negative, 
the routine cycles as shown. If, however, the outcome 
of the test at step 32 indicates that a task to be deployed 
has been specified, the method continues at step 34. 
[0053] At step 34, a test is performed to determine if 
the discovery process has been enabled by the system 
or system administrator, together with the nature of the 
specification. Although not meant to be limiting, the ad- 
ministrator may enable the discovery process (and 
specify the discovery operation in particular) by using 
the GUI, a command line interface (CLI) or any other 
known interface technique. If the outcome of the test at
step 34 is negative, the routine terminates. If the discovery process
has been enabled as indicated by a positive outcome to 
the test at step 34, the routine continues at step 36 to 
query a repository 43 (e.g., in the management server) 
to determine whether the target machines and their 
characteristics (e.g., location, state, status, configura- 
tion, and the like) have already been discovered or spec- 
ified. If the outcome of the test at step 36 is positive, the 
routine returns information to the dispatch mechanism 
at step 38 and the returned data is then instantiated as 
needed at step 40. If this path is taken, the routine then 
terminates because discovery is not needed. 
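The decision sequence of steps 34 through 40 may be sketched as follows. This is an illustrative sketch only and not the disclosed implementation: the names `Repository`, `plan_deployment` and `max_age_seconds`, and the representation of a repository record as a (targets, timestamp) pair, are assumptions introduced for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Repository:
    """Hypothetical stand-in for repository 43: task name -> (targets, timestamp)."""
    max_age_seconds: float = 3600.0
    records: dict = field(default_factory=dict)

    def lookup(self, task: str, now: float) -> Optional[list]:
        entry = self.records.get(task)
        if entry is None:
            return None  # nothing has been discovered or specified
        targets, stamp = entry
        if now - stamp > self.max_age_seconds:
            return None  # information is outdated, treat it as unavailable
        return targets


def plan_deployment(task: str, discovery_enabled: bool,
                    repo: Repository, now: float) -> tuple:
    """Return (action, targets) for a requested task deployment."""
    if not discovery_enabled:           # negative outcome at step 34
        return ("terminate", None)
    targets = repo.lookup(task, now)    # query at step 36
    if targets is not None:             # steps 38 and 40
        return ("use_repository", targets)
    return ("launch_discovery", None)   # discovery is needed
```

In this sketch, a stale or missing record yields the same result, so the caller needs only one path for launching discovery.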
[0054] If, however, the result of the query at step 36 
indicates that the necessary information is not available 
from the repository (e.g., because such a repository 
does not exist, because certain information needed to 
tailor the distribution has not been collected, because 
information is outdated due to given aging factors, etc.), 
the routine continues at step 42. In particular, the task 
to be deployed is parsed to identify one or more search 
characteristics. This step may be carried out automatically
or be controlled by information specified by the user
(e.g., through the GUI). Thus, for example, if the task
to be deployed is a database management task that will
be supported on database servers, step 42 may identify
a given characteristic of a candidate machine to facilitate
the search process. In this example, that characteristic
may be "machines with resident database server
software" or the like. To facilitate this process, the GUI
may display icons or other visual devices that may be 
selected to form associations with machines, resources 
or their attributes. Any convenient specification or se- 
lection mechanism may be implemented, of course. By 
identifying one or more characteristics of the task to be 
deployed, the inventive mechanism may tailor or cus- 
tomize a software agent to look for certain specific hard- 
ware, software or other components on a candidate ma- 
chine in a more efficient manner. 
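By way of illustration only, step 42 might be approximated as a lookup from a task category to the characteristics a candidate machine must exhibit. The table `TASK_CHARACTERISTICS`, the task fields `category` and `user_characteristics`, and the characteristic strings are hypothetical; the description does not prescribe any particular representation.

```python
# Hypothetical mapping from a task category to the characteristics a
# candidate machine must exhibit; the description does not define one.
TASK_CHARACTERISTICS = {
    "database-management": {"database-server-software"},
    "web-content-update": {"web-server-software"},
}


def search_characteristics(task: dict) -> set:
    """Derive the search characteristics for a task (step 42)."""
    derived = set(TASK_CHARACTERISTICS.get(task.get("category"), set()))
    # Characteristics chosen by the administrator (e.g., through the GUI)
    # are merged with those derived automatically.
    derived |= set(task.get("user_characteristics", ()))
    return derived
```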
[0055] To this end, the routine then continues at step 
44 to select, construct or subclass an appropriate soft- 
ware agent based on the given characteristics derived 
in step 42, upon some other user-selected or system- 
selected criteria, or based on some other information 
such as historical data. As used herein, the selection 
process of step 44 may involve compiling one or more 
software tasks into a 'custom' software agent for this 
purpose. Thus, the present invention covers the use of 
an existing software agent, as well as an agent that is 
created or generated "on-the-fly". 
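The select, construct or subclass choice of step 44 may be sketched as follows, assuming for illustration that an agent is characterized solely by a set of required characteristics. The class names, the `EXISTING_AGENTS` registry and the `matches` method are hypothetical, and the Python built-in `type()` is used here merely as one way to create a subclass at run time.

```python
class DiscoveryAgent:
    """Generic agent: a node qualifies when it has every required feature."""
    required = frozenset()

    def matches(self, node_features: set) -> bool:
        return self.required <= set(node_features)


class DatabaseAgent(DiscoveryAgent):
    required = frozenset({"database-server-software"})


# Pre-built agents, keyed by the characteristics they look for.
EXISTING_AGENTS = {DatabaseAgent.required: DatabaseAgent}


def select_agent(characteristics: set) -> DiscoveryAgent:
    """Use an existing agent if one fits; otherwise subclass one on the fly."""
    key = frozenset(characteristics)
    agent_cls = EXISTING_AGENTS.get(key)
    if agent_cls is None:
        # No existing agent fits, so a custom subclass is generated.
        agent_cls = type("CustomAgent", (DiscoveryAgent,), {"required": key})
    return agent_cls()
```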
[0056] At step 46, the software agent is deployed into 
the network. The agent includes appropriate routines 
designed to enable the code to be dropped into the local 
execution context and controlled to effect the specific 
discovery operation. Step 46 may involve deployment 
of multiple agents (dispatched concurrently or progres- 
sively) depending on the topology of the network. As 
previously noted, each software agent is a mobile "dis- 
covery" agent whose purpose is to discover the distri- 
bution information. This completes the discovery agent 
deployment routine. 

[0057] The flowchart of Figure 6 illustrates the discov- 
ery operation at a particular node. The routine begins at 
step 50 when a given software agent arrives at the node. 
Of course, because multiple agents may be dispatched 
within the network, the routine shown in Figure 6 may 
be carried out concurrently (or otherwise) on many dif- 
ferent nodes in the network. At step 52, the discovery 
agent is linked to the local automation engine. Such link- 
ing typically involves binding the software agent into the 
runtime environment. The local discovery process is 
then initiated at step 54. At step 56, a test is performed 
to determine whether the system under test meets a given
criterion (preferably as specified through the customization
process described above). If the outcome of the
test at step 56 is indeterminate, the routine cycles as 
illustrated. If, however, the outcome of the test at step 
56 indicates that the machine (or some given compo- 
nent thereof) satisfies the search criteria specified by 
the software agent, the routine branches to step 58. 
[0058] At this point in the routine, the software agent
collects, compiles or otherwise generates appropriate
information that may be required or desired by the dispatch
mechanism. This information includes, for example,
information or data identifying the host platform, its
location and other identifying characteristics, information
identifying a current state of operating components,
and the like. The particular type of information will vary
depending on the task to be deployed or, more gener- 
ally, the nature of the discovery operation per se. At step 
60, a test is made to determine whether the software 
agent is self-extinguishing, i.e. whether the agent is to 
extinguish or "kill" itself upon completion of the search
over a given network path. If the outcome of the test at 
step 60 indicates that the software agent is self-extin- 
guishing, the routine continues at step 62 by transmitting 
the identifying information back through the network,
e.g., using a local message facility. If the software agent 
is not self-extinguishing, the identifying information is 
written to a datastore associated with the agent at step 
64. Such information is later disgorged when the software
agent returns to the dispatch mechanism. A typical
datastore is located "within" the agent code itself. Alter- 
natively, the identifying information may be stored at the 
platform. 
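The branch at steps 60 through 64 may be illustrated as follows. This is a sketch under stated assumptions: the `Agent` class, the `record_finding` function and the use of a plain list (`outbox`) in place of a local message facility are hypothetical and are introduced only for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    self_extinguishing: bool
    datastore: list = field(default_factory=list)  # carried "within" the agent


def record_finding(agent: Agent, finding: dict, outbox: list) -> None:
    """Steps 60-64: report a finding immediately or carry it home."""
    if agent.self_extinguishing:
        # Step 62: the agent will not return, so the data is sent back
        # now; `outbox` stands in for a local message facility.
        outbox.append(finding)
    else:
        # Step 64: the data is kept until the agent returns to the
        # dispatch mechanism.
        agent.datastore.append(finding)
```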

[0059] Control then continues at step 66. This step is
also reached in the main processing loop if the outcome 
of step 56 indicates that the platform does not qualify 
under the search criteria. At step 66, a test is performed 
to determine whether other additional network paths 
should be traversed. The criteria for determining this 
question may be simple, e.g., the particular machine is 
an endpoint (in which case, no further traversal is required),
or it may be more complex. If the outcome of the test
is negative, the routine either extinguishes the software 
agent (if the result of the test at step 60 was positive), 
or the software agent is launched back toward the dis- 
patch mechanism. This is step 68. If, however, the out- 
come of the test at step 66 is positive, there are addi- 
tional network paths to be traversed by the software 
agent.

[0060] The routine then continues at step 70 to test 
whether the software agent is to be cloned to continue 
the search. Under certain circumstances, e.g., where 
the software agent might be useful for some other diag- 
nostic purposes, it may be desirable to maintain the 
agent at the platform after the local discovery has been 
completed. Thus, for example, a future discovery oper- 
ation at the node may be simplified by having a previ- 
ously executed agent (or some portion thereof) already 
resident. 

[0061] Thus, the software agent generally includes 
the capability to return to the dispatcher, to remain at 
the node, or to clone and launch another instance of it- 
self to continue the search. If the outcome of the test at 
step 70 is positive, the software agent is cloned at step 
72 and then launched over an identified path at step 74. 
This routine continues in an iterative manner until all
software agents have either extinguished themselves or
returned back to the dispatcher. Information returned to
the dispatcher preferably is stored for reference purposes
to facilitate (e.g., narrow) future search "fields".
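The per-node routine of Figure 6, including the clone-and-launch behavior, may be approximated by the following in-process simulation. It is a simplified sketch only: the network is assumed to be given as an adjacency mapping, every onward path receives a clone, and the names `discover`, `network`, `features` and `required` are hypothetical. Actual agents would migrate between machines rather than share one queue.

```python
from collections import deque


def discover(network: dict, features: dict, start: str, required: set) -> list:
    """Walk the network from `start`, cloning the agent onto each onward path.

    `network` maps a node to the nodes reachable from it; `features` maps a
    node to the set of characteristics found there. Both are hypothetical
    inputs used only for this sketch.
    """
    found = []
    visited = {start}
    pending = deque([start])          # one entry per agent instance in flight
    while pending:
        node = pending.popleft()      # step 50: an agent arrives at a node
        if required <= features.get(node, set()):   # test at step 56
            found.append(node)        # step 58: collect identifying data
        for nxt in network.get(node, ()):           # step 66: more paths?
            if nxt not in visited:
                visited.add(nxt)
                pending.append(nxt)   # steps 72-74: clone and launch
        # With no onward paths, the instance returns or extinguishes (step 68).
    return found
```

The loop ends when no agent instance remains in flight, which corresponds to all agents having extinguished themselves or returned.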
[0062] Thus, the present invention provides a mechanism
for discovering the locations and characteristics
of target workstations for some task to be deployed in
the environment. The task itself may be a software agent
or any other type of application, process or other routine.
If prior discovery has occurred, information derived 
therefrom may be used to facilitate the deployment. 
However, where such information is not available or is 
otherwise not useful (e.g., because it is outdated), one 
or more discovery agents are first launched to discover 
the required information. The discovered information is 
then returned to the dispatcher for use to facilitate the 
targeted distribution of the task.
[0063] One of ordinary skill will appreciate that the dis- 
patcher may direct the search strategy in one or more 
ways to reduce the number of software agents required 
or the number of nodes that must be visited to generate 
the list of target nodes. The software agent(s) might then 
be deployed to the general "target" area from which the 
specific target locations are then identified. 
[0064] In one preferred embodiment, the agent is an 
object composed of a set of tasks routable to appropri- 
ate systems in the large, distributed computer network. 
The set of tasks may be coupled together as may be 
necessary to diagnose and/or correct the fault. At each 
node, the agent is preferably incorporated into or other- 
wise executed by the previously-deployed runtime en- 
vironment. Thus, as a large portion (namely, the runtime 
engine) of the discovery capability is already at the sys- 
tem to be evaluated, network traffic is further minimized. 
[0065] Once the target machines have been identi- 
fied, the task is deployed to these machines or some 
other given action is taken. For example, a "map" of the 
target machines may be stored at the dispatcher or else- 
where to facilitate a subsequent deployment at a later 
time. Thus, the discovery mechanism is also useful for 
"charting" or "mapping" the topology of the networked 
environment for research or other purposes. When a 
task is later deployed, the deployment is focused to only 
those regions of the managed network that are required 
to receive the task. This greatly reduces bandwidth consumption
and thereby conserves network resources.
[0066] A particular agent may or may not carry all of
the code needed to determine whether the node is a
suitable target. If it does not, the agent may send
requests to the dispatch mechanism for additional code
to effect the local discovery process. The additional
code may be other software agent(s).
[0067] The software agent is preferably the smallest
amount of software code that is necessary to discover 
the target machine or to perform some task associated 
with the local discovery process. By distributing some 
of the discovery functionality in the engine, network 
bandwidth is conserved because only a small amount 
of code needs to be dispatched to the target site. This
further reduces complexity and cost of systems management
in the large enterprise environment.
[0068] Preferably, the client-class framework running
on each endpoint is a low-maintenance, low-cost framework
that is ready to do management tasks but consumes
few machine resources (because it is normally
in an idle state). Each endpoint preferably is "dataless"
in the sense that system management data is not stored 
therein before or after a particular system management 
task is implemented or carried out. This architecture advantageously
enables a rational partitioning of the enterprise
with 10's of servers, 100's of gateway machines,
and 1000's of endpoints. Each server typically serves 
up to 200 gateways, each of which services 1000's of 
endpoints. At the framework level, all operations to or 
from an endpoint pass through a gateway machine. In 
many operations, the gateway is transparent; it receives 
a request, determines the targets, resends the requests, 
waits for results, then returns results back to the caller. 
Each gateway handles multiple simultaneous requests, 
and there may be any number of gateways in an enter- 
prise, with the exact number depending on many factors 
including the available resources and the number of 
endpoints that need to be serviced. 
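The transparent gateway behavior described above (receive a request, determine the targets, resend the request, wait for results, return them to the caller) may be sketched as follows. The function `gateway_relay` and the representation of endpoints as in-process callables are assumptions made for illustration; a thread pool is used here only to show the requests being handled concurrently.

```python
from concurrent.futures import ThreadPoolExecutor


def gateway_relay(request: str, endpoints: dict, wanted: set) -> dict:
    """Forward `request` to the targeted endpoints and gather the replies.

    `endpoints` maps an endpoint name to a callable that handles a request;
    this is a hypothetical in-process stand-in for real endpoint machines.
    """
    # Determine the targets serviced by this gateway.
    targets = sorted(name for name in endpoints if name in wanted)
    with ThreadPoolExecutor(max_workers=max(1, len(targets))) as pool:
        # Resend the request to every target concurrently, then wait.
        replies = list(pool.map(lambda name: endpoints[name](request), targets))
    return dict(zip(targets, replies))  # results go back to the caller
```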
[0069] In the preferred embodiment, these and other 
objects are thus achieved in a large distributed enter- 
prise that includes computing resources organized into 
one or more managed regions, each region being man- 
aged by a management server servicing one or more 
gateway machines, with each gateway machine servic- 
ing a plurality of endpoint machines. As has been illus- 
trated and described, a system management framework 
is preferably "distributed" on the gateway machines and 
the one or more endpoint machines to carry out system 
management tasks. Although the above environment is 
preferred, one of ordinary skill will appreciate that the 
inventive concepts may be implemented in smaller dis- 
tributed client server network environments. Thus, the 
invention should not be construed to be limited to a par- 
ticular large scale, distributed computing environment 
as described in the preferred embodiment. 
[0070] One of the preferred implementations of the in- 
vention is as a set of instructions in a code module res- 
ident in the random access memory of a computer. Until 
required by the computer, the set of instructions may be 
stored in another computer memory, for example, in a 
hard disk drive, or in a removable memory such as an 
optical disk (for eventual use in a CD ROM) or floppy 
disk (for eventual use in a floppy disk drive), or even 
downloaded via the Internet. 

[0071] In addition, although the various methods de- 
scribed are conveniently implemented in a general pur- 
pose computer selectively activated or reconfigured by 
software, one of ordinary skill in the art would also recognize
that such methods may be carried out in hardware,
in firmware, or in more specialized apparatus constructed
to perform the required method steps.
[0072] Further, although the invention has been described
in terms of a preferred embodiment in a specific
network environment, those skilled in the art will recognize
that the inventive diagnostic technique should be
useful in any distributed network environment.



Claims

1. A task discovery method operative in a distributed
computer network in which a management infra-
structure is supported, comprising the steps of:

selecting at least one software agent from a set
of software agents; and

migrating the selected software agent across a
given set of nodes in the computer network to
identify target machines for task deployment.

2. The method according to Claim 1 wherein the selecting
step includes identifying a given characteristic
of the task and selecting the software agent
based on the given characteristic.

3. A method of discovery in a distributed computer net-
work having a management server servicing a set
of machines, comprising the steps of:

deploying instances of a runtime engine across
a subset of the machines to create a distributed
runtime environment in the distributed computer
network;

in response to a given occurrence, deploying a
discovery agent into the computer network from
a source; and

at a given machine supporting an instance of
the runtime engine, executing the discovery
agent using the runtime engine to perform a dis-
covery operation.

4. The discovery method according to Claim 3 wherein
the discovery operation is selected from a group of
discovery operations consisting of identifying machines
suitable for a task deployment, identifying a
set of resources associated with a machine, identi-
fying a machine type, and identifying a given char-
acteristic of a resource at a machine.

5. The discovery method according to Claim 3 further
including the step of collecting information discov-
ered by the discovery agent.

6. The discovery method according to Claim 5 further
including the step of returning the discovery agent
to the source.

7. The discovery method according to Claim 3 further
including the step of cloning the discovery agent at
the machine.

8. The discovery method according to Claim 7 further
including the step of launching the cloned discovery
agent along a new path in the computer network.

9. A method of discovery according to claim 3, wherein 
said occurrence is a task deployment request in re- 
sponse to which the discovery agent is migrated 
across a given set of nodes in the computer net- 
work; and the execution of the discovery agent 
takes place in the runtime environment at each 
node at which the discovery agent is received to 
perform the following steps: 

(a) determining whether the machine at the 
node is a target of the distribution request; 

(b) identifying a given subset of nodes associ- 
ated with the node that remain candidates for 
the distribution request; 

(c) deploying the discovery agent to the given 
subset; and 

(d) repeating steps (a)-(c) until the targets are 
identified or all network paths are exhausted. 

10. The method according to Claim 9 further including 
the step of compiling a list of machines that are to 
receive the distribution request. 

11. The method according to Claim 9 further including 
the step of displaying a list of the machines identi- 
fied. 

12. The method according to Claim 9 wherein at least 
one of the discovery agents is customized based on 
the task deployment request. 

13. The method according to Claim 3 or 9 wherein the 
runtime environment comprises a runtime engine 
and each discovery agent is a set of one or more 
tasks executable by the runtime engine. 

14. The method according to Claim 13 wherein the 
computer network is the Internet, the runtime en- 
gine is associated with a browser and the discovery 
agent is an applet. 

15. An apparatus connectable into a large distributed 
enterprise having a management server servicing 
a set of endpoint machines for effecting a discovery 
operation, comprising: 



means, responsive to a given occurrence, for
selecting a software agent executable by the
runtime engine at a given endpoint machine;
and

means for deploying the selected software
agent into the computer network to perform a
discovery operation.

16. A discovery system connectable into a large distrib-
uted enterprise having a management server serv-
icing a set of endpoint machines for deploying a
task, comprising:

a plurality of instances of a runtime engine each
supported on a given endpoint machine; and

means, responsive to a discovery request, for
dispatching a set of one or more software
agents into the distributed enterprise to identify
machines that satisfy a given criteria, wherein
a given software agent is executable by the
runtime engine at a given endpoint machine.

17. The system according to Claim 16 wherein the giv-
en criteria is a determination that the endpoint ma- 
chine is a candidate for the discovery operation. 

18. The system according to Claim 16 further including 
means for generating software agents.

19. The system according to Claim 18 wherein the generating
means includes means for customizing a
given software agent as a function of the discovery
operation.

20. A computer program product in a computer-reada-
ble medium for use in a computer having a proces-
sor, a memory, and means for connecting the computer
into a large distributed enterprise having a
management server, the computer program product
comprising:

a runtime engine downloaded to the computer
during a first operation; and

a software agent deployed to the computer dur-
ing a discovery operation and being executable
by the runtime environment to discover whether
the computer satisfies a given criteria.

21. A computer program product in a computer-reada- 
ble medium for use in a computer having a proces- 
sor, a memory, and means for connecting the com- 
puter into a large distributed computer network, the 
computer network having a management server
servicing a set of machines, the computer program
product comprising:

a plurality of instances of a runtime engine, with
each instance supported at a given endpoint
machine;

a plurality of instances of a runtime engine,
each runtime engine for use at a given ma-
chine;

a set of software agents, each of the software
agents comprising a set of one or more tasks;

means, responsive to a given request, for se-
lecting a software agent to be deployed into the
network, the software agent being executable
by the runtime engine at a given endpoint ma-
chine to determine whether the given endpoint
machine is a candidate to receive a task to be
subsequently deployed.